Evaluation of phone lattice based speech decoding
Authors
Abstract
Previously, we proposed a flexible two-layered speech recogniser architecture called FLaVoR. In the first layer, an unconstrained, task-independent phone recogniser generates a phone lattice. Only in the second layer are the task-specific lexicon and language model applied to decode the phone lattice and produce a word-level recognition result. In this paper, we present a further evaluation of the FLaVoR architecture. The performance of a classical single-layered architecture and that of the FLaVoR architecture are compared on two recognition tasks, using the same acoustic, lexical and language models. On the large-vocabulary Wall Street Journal 5k and 20k benchmark tasks, the two-layered architecture achieved slightly, but not significantly, better word error rates. On a reading error detection task for a children's reading tutor, the FLaVoR architecture clearly outperformed the single-layered architecture.
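The second-layer idea described in the abstract (applying a lexicon and language model to a phone lattice) can be illustrated with a toy sketch. This is not the FLaVoR implementation: the function name `decode_phone_lattice`, the edge format, the unigram language model, and the example lattice are all assumptions made for illustration, and the sketch uses exact phone matching with no phone confusion model.

```python
import math
from collections import defaultdict


def decode_phone_lattice(edges, lexicon, word_logprob, start, final):
    """Toy second-layer decoder (hypothetical, for illustration only).

    edges:        list of (src_node, dst_node, phone, acoustic_log_score);
                  nodes are assumed to be integers in topological order
                  (every edge goes from a lower to a higher node number).
    lexicon:      dict mapping word -> non-empty list of phones.
    word_logprob: dict mapping word -> unigram log probability (an assumed,
                  simplified stand-in for a real language model).
    Returns (total_log_score, word_list) for the best path, or None.
    """
    outgoing = defaultdict(list)
    for src, dst, phone, score in edges:
        outgoing[src].append((dst, phone, score))

    def match(node, phones):
        # Yield (end_node, acoustic_score) for every lattice path that
        # starts at `node` and spells exactly `phones`.
        if not phones:
            yield node, 0.0
            return
        for dst, phone, score in outgoing[node]:
            if phone == phones[0]:
                for end, rest in match(dst, phones[1:]):
                    yield end, score + rest

    # best[node] = (best log score, word sequence) ending at that node.
    best = {start: (0.0, [])}
    nodes = sorted({e[0] for e in edges} | {e[1] for e in edges})
    for node in nodes:
        if node not in best:
            continue  # no word sequence ends at this node
        base_score, base_words = best[node]
        for word, phones in lexicon.items():
            for end, acoustic in match(node, phones):
                cand = base_score + acoustic + word_logprob[word]
                if end not in best or cand > best[end][0]:
                    best[end] = (cand, base_words + [word])
    return best.get(final)


# Example lattice with two competing phones between nodes 3 and 4.
edges = [
    (0, 1, "dh", -0.2), (1, 2, "ah", -0.3),
    (2, 3, "k", -0.4), (3, 4, "ae", -0.5), (3, 4, "ah", -0.6),
    (4, 5, "t", -0.3),
]
lexicon = {"the": ["dh", "ah"], "cat": ["k", "ae", "t"], "cut": ["k", "ah", "t"]}
word_logprob = {"the": math.log(0.5), "cat": math.log(0.3), "cut": math.log(0.2)}

result = decode_phone_lattice(edges, lexicon, word_logprob, start=0, final=5)
print(result)
```

Because the phone lattice is produced without any task knowledge, the same lattice could in principle be decoded again with a different `lexicon` or `word_logprob`, which is the kind of flexibility the two-layered design aims for.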
Similar resources
Phone Synchronous Decoding with CTC Lattice
Connectionist Temporal Classification (CTC) has recently shown improved efficiency in LVCSR decoding. One popular implementation is to use a CTC model to predict the phone posteriors at each frame which are then used for Viterbi beam search on a modified WFST network. This is still within the traditional frame synchronous decoding framework. In this paper, the peaky posterior property of a CTC ...
Minimum hypothesis phone error as a decoding method for speech recognition
In this paper we show how methods for approximating phone error as normally used for Minimum Phone Error (MPE) discriminative training, can be used instead as a decoding criterion for lattice rescoring. This is an alternative to Confusion Networks (CN) which are commonly used in speech recognition. The standard (Maximum A Posteriori) decoding approach is a Minimum Bayes Risk estimate with respe...
Automatic assessment of children's reading with the FLaVoR decoding using a phone confusion model
Reading skills of children can be improved with the help of automatic reading tutors (ART), i.e. interactive software with an appealing interface which supports and challenges the child in the reading task, provides instantaneous feedback and automatically assesses its reading skills. For this purpose, ARTs benefit from automatic speech recognition technology for tracking the child’s responses ...
Language recognition using phone lattices
This paper proposes a new phone lattice based method for automatic language recognition from speech data. By using phone lattices some approximations usually made by language identification (LID) systems relying on phonotactic constraints to simplify the training and decoding processes can be avoided. We demonstrate the use of phone lattices both in training and testing significantly improves t...
Combining Lattice-Based Language Dependent and Independent Approaches for Out-of-Language Detection in LVCSR
In this paper, the Out-Of-Language (OOL) detection problem is handled by both language dependent (LD) and language independent (LI) approaches. In the LD approach, a novel speech content and language joint recognition algorithm is proposed, which integrates a phone lattice-based vector space modeling language recognition (LRE) backend into the conventional speech decoding procedure. In the LI appro...